
Segmentation of word gestures in sign language video

Abstract

Despite the widespread use of automatic speech recognition and video subtitles, sign language remains an important communication channel for people with hearing impairments. A key task in automatic sign language recognition is segmenting video into fragments corresponding to individual words. Unlike known methods of sign language word segmentation, the proposed approach does not require sensors (accelerometers). The video is segmented into words by estimating image dynamics, and the boundary between words is detected using a threshold value. Since the frame may contain moving objects other than the signer, which introduce noise, the dynamics are estimated as the average frame-to-frame change in the Euclidean distance between the coordinate features of the hands, forearms, eyes, and mouth. The coordinate features of the hands and head are computed with the MediaPipe library. The algorithm was tested on Vietnamese sign language using an open set of 4364 videos collected at the Vietnamese Sign Language Training Center; it demonstrated accuracy comparable to manual segmentation by an operator and low resource consumption, making it suitable for real-time automatic gesture recognition. The experiments showed that, unlike known methods, the sign language segmentation task can be solved effectively without sensors. Like other gesture segmentation methods, the proposed algorithm does not perform satisfactorily at high signing speed, when words overlap one another. This problem is the subject of further research.
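The thresholding idea described in the abstract can be illustrated with a minimal sketch. The function name, the threshold and minimum-length parameters, and the use of a NumPy array of precomputed per-frame landmark coordinates (such as those produced by MediaPipe) are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def segment_words(landmarks: np.ndarray, threshold: float, min_len: int = 2):
    """Split a landmark sequence into word segments (illustrative sketch).

    landmarks: array of shape (T, N, 2) with per-frame (x, y) coordinates of
    N tracked points (e.g. hand, forearm, eye and mouth landmarks).
    Returns a list of (start, end) frame-index pairs, end exclusive.
    """
    # Mean Euclidean displacement of the landmarks between consecutive frames:
    # an estimate of the image dynamics that ignores background motion.
    motion = np.linalg.norm(np.diff(landmarks, axis=0), axis=2).mean(axis=1)
    active = motion > threshold  # True where the signer is moving

    # Contiguous runs of above-threshold motion are treated as words;
    # runs shorter than min_len frames are discarded as noise.
    segments, start = [], None
    for t, moving in enumerate(active):
        if moving and start is None:
            start = t
        elif not moving and start is not None:
            if t - start >= min_len:
                segments.append((start, t))
            start = None
    if start is not None and len(active) - start >= min_len:
        segments.append((start, len(active)))
    return segments
```

A sequence that is still, then moves steadily, then is still again yields a single segment covering the moving frames; the threshold would in practice be tuned on annotated data.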

